Introduction and Overview

EC320, Lecture 1

Dante Yasui

06 2024

Prologue

Motivation

What is the goal of econometrics?

To learn about the world using data.

Why do economists (and others) study econometrics?

Providing answers to important problems. Gaining confidence in our understanding of the world.

Ex.

  • Do minimum wage policies reduce poverty?
  • Does the death penalty deter violent crime?
  • What are the harms of toxic pollutants like lead?
  • What explains the gender pay gap?
  • Can we forecast the next recession?
  • Who gains and who loses from globalization?

How do you pronounce it?

Motivation

Why should you study econometrics?

Develop skills and learn to use tools that are valued by employers.

Cultivate a healthy sense of skepticism.

Develop and feed your own curiosity about the world.

IMO, of all the courses in a typical economics major, econometrics is the most translatable to a job.

  • Data is the new oil
  • Extracting meaningful analysis from big data is a sought-after skill in the job market of 2024

Motivation

Why should you study econometrics?


Throughout this course, I will try my best to emphasize why:

  • Why are we learning this?
  • Why does this matter with regard to future econometrics courses?
  • Why is [fill in the blank] important for answering important problems?
  • Why does this matter to employers?


Econometrics is built on crucial fundamentals. These fundamentals are the focus of this class.

Pronunciation of “econometrics”: uk · kaa · nuh · meh · truhks

Most econometric inquiry concerns one of two distinct goals:

  1. Prediction: Accurately predict or forecast an outcome given a set of predictors. Given what we know about \(x\), what values do we expect \(y\) to take?
  2. Causal identification: Estimate the effect of an intervention on an outcome. How does \(y\) change when we change \(x\)?

In this class, and in EC 421, we will focus on the latter. The former is the focus of EC 422 and EC 524.

In EC320


We start to build up the fundamentals of causal analysis


But first we need to build up the necessary Theory, Tools, and Skills


This course will focus almost exclusively on a particular method that is common in statistics in general:


  • Ordinary Least Squares (OLS) (aka linear regression)
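As a rough preview of what OLS does, here is a minimal sketch on simulated data. The variable names, sample size, seed, and the "true" intercept and slope (1 and 2) are assumptions chosen for illustration, not course data. `lm()` is base R's linear model function.

```r
# Simulate data from a known linear relationship (illustrative values only)
set.seed(320)
n <- 1000
x <- rnorm(n)
u <- rnorm(n)          # unobserved noise, drawn independently of x
y <- 1 + 2 * x + u     # assumed true intercept = 1, true slope = 2

# Fit the regression of y on x by ordinary least squares
fit <- lm(y ~ x)
coef(fit)

# In a simple regression, the OLS slope is cov(x, y) / var(x)
slope_by_hand <- cov(x, y) / var(x)
all.equal(unname(coef(fit)["x"]), slope_by_hand)
```

Because the noise here is independent of \(x\) by construction, the estimated slope should land close to the assumed value of 2. Much of the course is about when that independence is believable in real data.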

Causal Identification

Causal identification

Common refrain:

“Correlation does not necessarily imply causation.”


Why might correlation fail to describe a causal relationship?

  • Omitted-variables bias
  • Selection bias
  • Simultaneity
  • Reverse causality
  • Coincidence

Causal identification

Correlation may imply causation if we assume “all else equal”

  • Hold everything fixed

This assumption is fragile in the real world.


Solutions:

  • Conduct experiments
  • Find a natural experiment

Do you think this is a causal statement?

Experiments

How can we ensure the “all else equal” assumption holds?

Randomization

Randomized Controlled Trials (RCTs)

  • widely used across many scientific disciplines
  • often touted as the gold standard of causal identification
  • use randomization to ensure all else is equal

In 2019, the Nobel Prize in economics went to researchers who adapted RCTs to projects in development economics.

Experiments Ex.

Research question

Does health insurance improve health?

The “all else equal” assumption would require:

  • all preexisting correlates of health must be the same across the insured and the uninsured

What would violate this assumption?

If higher income is correlated with better health, and those who buy health insurance have higher average income, then the assumption is violated.

Experiments Ex.

But what if health insurance is randomly assigned?

  • Then, assuming the assignment is perfectly random across a large enough sample, this assumption becomes much more palatable.
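The contrast between self-selected and randomly assigned insurance can be sketched with a small simulation. Everything here is hypothetical: the variable names, the assumed true effect of 1, the income effect of 2, and the sample size are made up for illustration and are not drawn from any real study.

```r
set.seed(320)
n <- 10000
true_effect <- 1                # assumed effect of insurance on health

income <- rnorm(n)              # standardized income (hypothetical)

# Case 1: people with higher income are more likely to buy insurance
insured_choice <- as.numeric(income + rnorm(n) > 0)
health_choice  <- true_effect * insured_choice + 2 * income + rnorm(n)
naive_diff <- mean(health_choice[insured_choice == 1]) -
  mean(health_choice[insured_choice == 0])

# Case 2: insurance is assigned by coin flip, independent of income
insured_random <- rbinom(n, size = 1, prob = 0.5)
health_random  <- true_effect * insured_random + 2 * income + rnorm(n)
random_diff <- mean(health_random[insured_random == 1]) -
  mean(health_random[insured_random == 0])

# Compare the two difference-in-means estimates with the assumed true effect
c(naive = naive_diff, randomized = random_diff, truth = true_effect)
```

In the first case the insured are also richer, so the simple difference in mean health mixes the effect of insurance with the effect of income and overstates it. In the second case income is balanced across groups on average, so the difference in means should sit close to the assumed effect.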

Oregon Health Insurance Experiment

The Oregon Health Insurance Experiment is a landmark study of the effect of expanding public health insurance on health care use, health outcomes, financial strain, and well-being of low-income adults… In 2008, the state of Oregon drew names by lottery for its Medicaid program for low-income, uninsured adults, generating just such an opportunity. This ongoing analysis represents a collaborative effort between researchers and the state of Oregon to learn about the costs and benefits of expanding public health insurance.

Natural experiments

An external, non-experimental factor creates circumstances that resemble a controlled experiment


Real-world events provide opportunity to compare similar groups


With some assumptions, researchers infer causal relationships by examining differences in outcomes between groups

Natural experiments

Any examples of natural experiments that come to mind?

Here are some of the more famous ones:

  1. Vietnam draft lottery
  2. The Mariel Boatlift
  3. Maimonides’ Rule and Class Size
  4. The Opening of the London Congestion Charge

In more recent news:

EC320

Syllabus

(click here)

Coursework

Rough outline of topics:

  • 01: Introduction and review
  • 02: The econometric problem
  • 03: SLR estimation
  • 04: SLR assumptions
  • 05: SLR inference
  • Midterm after 2nd week
  • 06: MLR estimation
  • 07: MLR inference
  • 08: Transformations
  • 09: Qualitative variables
  • 10: Heteroskedasticity
  • Final after 4th week


Final Due: Sunday, July 21st @ 08:00a

Course site

I use GitHub to host a separate site with all the course materials

You can find a link to it here or on the Canvas homepage


I use it because:

  1. I am adapting materials from a previous class's GitHub repo
  2. it is convenient to track changes and keep everything up to date
  3. acts as a secondary site in case Canvas dies

I will try my best to keep Canvas up to date, but if you can’t find something there, check the GitHub site.

All video recordings will only be available on Canvas

About me

Please call me Dante

  • Office hours:
    • Tuesdays: 10-11am PST
    • Thursdays: 3-4pm


> Metrics

  • I ❤️ studying econometrics
  • This is my first time teaching EC320, but I have TAed for this class twice
  • Instructor: EC327 Winter 2024

> Grad school

  • 3rd year Econ PhD student
  • Applied topics related to migration and historical data
  • Causal inference, statistical learning, and data science

> Before grad school

  • Grew up in Davis, CA
  • Traveled around CA for gymnastics and quizbowl competitions in high school
  • Studied Economics and Japanese at college in Portland, OR
  • Prior to PhD, researched economic history and geography

In EC320

An applied econometrician needs a solid grasp on (at least) three areas:

  1. The underlying theory (assumptions, strengths, weaknesses).
  2. The ability to load, aggregate, join, and visualize large datasets.
  3. The ability to apply the theoretical methods to actual data.

This course aims to deepen your knowledge in each of these three areas.

  • 1: Analytical skills (Math)
  • 2-3: Computational tools (R)

What is R?

To quote the project website:

R is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows and MacOS.

What does that mean?

  • R was created for the statistical and graphical work required by econometrics, and it was written by statistical programmers

  • R has a vibrant, thriving online community (e.g., Stack Overflow)

  • Plus it’s free and open source

Why are we using R?

1. R is free and open source, saving both you and the university money.


2. Related: Outside of a small group of economists, private- and public-sector employers favor R over Stata and most competing software.


3. R is very flexible and powerful, adaptable to nearly any task, e.g., ’metrics, spatial data analysis, machine learning, web scraping, data cleaning, website building, teaching. I write all my slides, problem sets, and exams in R.

Why are we using R?

4. Related: R imposes no artificial restrictions on your number of observations, variables, memory, or processing power.


5. If you put in the work, you will come away with a valuable and marketable tool.


6. Learning R and the tidyverse will help you to learn other useful tools like SQL, Python, etc.

Cool examples of what you can do with R

# gganimate animates ggplot2 plots; gapminder supplies the data and country_colors
library(ggplot2); library(gganimate); library(gapminder)
library(dplyr)  # provides %>% and filter()
# The ggplot code used to plot the data
ggplot(
  data = gapminder %>% filter(continent != "Oceania"),
  aes(gdpPercap, lifeExp, size = pop, color = country)
) +
  geom_point(alpha = 0.7, show.legend = FALSE) +
  scale_colour_manual(values = country_colors) +
  scale_size(range = c(2, 12)) +
  scale_x_log10("GDP per capita", labels = scales::comma) +
  facet_wrap(~continent) +
  theme(panel.border = element_rect(color = "grey90", fill = NA)) +
  # Here come the gganimate-specific bits
  labs(title = "Year: {frame_time}") +
  ylab("Life Expectancy") +
  transition_time(year) +
  ease_aes("linear")

R + Maps

Getting started with R

R setup for EC 320

Installation

You need to install 2 pieces of software: R and RStudio.

For explicit installation instructions, follow this tutorial.


Note: R/RStudio installations differ by operating system.

R setup for EC 320

R vs. RStudio

R

  • The programming language (e.g., English, Spanish, French, etc.)
  • Ex. The engine, chassis, wheels, etc. of a car

RStudio

  • The Integrated Development Environment (IDE) (e.g., a word processor)
  • Ex. The dashboard containing various buttons and monitors

R works without RStudio

RStudio doesn’t work without R

R basics

You will dive deeper in lab, but here are six big points about R:

  1. Everything is an object: foo

  2. Every object has a name and value: foo <- 2

  3. You use functions on these objects: mean(foo)

  4. Functions come in libraries (packages): library(dplyr)

  5. R will try to help you: ?dplyr

  6. R has its quirks: NA; error; warning
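The six points can be tried out in a short script. This is only a sketch: it uses the `stats` package, which ships with R, in place of `dplyr` so that it runs without installing anything, and the object names other than `foo` are made up for illustration.

```r
# 1-2. Everything is an object, and every object has a name and a value
foo <- 2                       # the name `foo` now refers to the value 2

# 3. You use functions on these objects
foo_mean <- mean(foo)

# 4. Functions come in packages; `stats` is installed with R itself
library(stats)
foo_median <- median(c(foo, 4, 9))

# 5. R will try to help you: typing ?mean at the console opens the help page

# 6. R has its quirks: a missing value (NA) propagates through calculations
quirk <- mean(c(1, NA, 3))                 # the result is NA, not 2
fixed <- mean(c(1, NA, 3), na.rm = TRUE)   # drop the NA before averaging

print(c(foo_mean = foo_mean, foo_median = foo_median, quirk = quirk, fixed = fixed))
```

Errors stop your code, while warnings let it continue but signal that something may be off. Reading both carefully is a large part of learning R.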

Next class: Statistics review
